Issues Related to Sampling Techniques for Network Traffic Dataset
Abstract
Network traffic data is huge, varying, and imbalanced because the classes are not equally distributed. Machine learning (ML) algorithms for traffic analysis use samples from this data to recommend actions for network administrators to take. Because of imbalances in the dataset, ML algorithms may give biased or false results, seriously degrading their performance. Since the network dataset is huge, training an ML algorithm takes a long time, so sampling should be used to reduce the training time. However, sampling may cause loss of information, which should be taken care of when obtaining the samples. In this paper, various sampling techniques are analysed for loss of information and imbalance when sampling network traffic data. The dataset was collected from the Panjab University network. Parameters such as classes missing from the samples and the probability with which different instances are sampled are considered for comparison.
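The "missing classes in samples" criterion can be illustrated with a minimal sketch. It is not the paper's method or data: the class names, class counts, and the 1% sampling fraction are all hypothetical, and simple random sampling and per-class (stratified) sampling are used only as two familiar techniques to compare.

```python
import random
from collections import defaultdict


def simple_random_sample(labels, fraction, rng):
    """Pick a fraction of all instances uniformly, ignoring class labels."""
    k = max(1, round(len(labels) * fraction))
    return rng.sample(range(len(labels)), k)


def stratified_sample(labels, fraction, rng):
    """Pick the same fraction from every class, keeping at least one instance per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    picked = []
    for indices in by_class.values():
        k = max(1, round(len(indices) * fraction))
        picked.extend(rng.sample(indices, k))
    return picked


def missing_classes(labels, sample_indices):
    """Return the classes present in the full data but absent from the sample."""
    return set(labels) - {labels[i] for i in sample_indices}


# Hypothetical, heavily imbalanced traffic labels (not the Panjab University data).
labels = ["web"] * 9000 + ["dns"] * 900 + ["p2p"] * 95 + ["attack"] * 5

rng = random.Random(0)  # fixed seed so the run is repeatable
srs = simple_random_sample(labels, 0.01, rng)
strat = stratified_sample(labels, 0.01, rng)

print("simple random, missing classes:", missing_classes(labels, srs))
print("stratified, missing classes:   ", missing_classes(labels, strat))
```

With a class as rare as the hypothetical "attack" label, a 1% simple random sample often contains none of its instances, whereas the stratified variant keeps every class by construction, at the cost of sampling rare instances with a higher probability than common ones.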
Related articles
Sampling Based Approaches to Handle Imbalances in Network Traffic Dataset for Machine Learning Techniques
Network traffic data is huge, varying, and imbalanced because the classes are not equally distributed. Machine learning (ML) algorithms for traffic analysis use samples from this data to recommend actions for network administrators to take, as well as for training. Because of imbalances in the dataset, it is difficult to train machine learning algorithms for traffic analysis and these may...
Behavioral Analysis of Traffic Flow for an Effective Network Traffic Identification
Fast and accurate network traffic identification is becoming essential for network management, high quality of service control and early detection of network traffic abnormalities. Techniques based on statistical features of packet flows have recently become popular for network classification due to the limitations of traditional port and payload based methods. In this paper, we propose a metho...
Performance of OpenDPI in Identifying Sampled Network Traffic
Identifying the nature of the traffic flowing through a TCP/IP network is a relevant goal for traffic engineering and security-related tasks. Despite the privacy concerns it raises, Deep Packet Inspection (DPI) is one of the most successful current techniques. Nevertheless, the performance of DPI is strongly limited by computational issues related to the huge amount of data it needs...
Convergence Optimization of Backpropagation Artificial Neural Network Used for Dichotomous Classification of Intrusion Detection Dataset
Two categories of machine-learning-based intrusion detection approaches are distinguished according to the type of input data. The first comprises network intrusion detection techniques, which consider only data captured in network traffic. The second comprises general intrusion detection techniques, which take in all possible data sources including host-based features as well as n...
DTMP: Energy Consumption Reduction in Body Area Networks Using a Dynamic Traffic Management Protocol
Advances in medical sciences with other fields of science and technology is closely casual profound mutations in different branches of science and methods for providing medical services affect the lives of its descriptor. Wireless Body Area Network (WBAN) represents such a leap. Those networks excite new branches in the world of telemedicine. Small wireless sensors, to be quite precise and calc...
Publication date: 2013